151 research outputs found

Gibberish, assistant, or master? Using tweets linking to news for extractive single-document summarization

Authors: GAO Wei, WEI Zhongyu
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/08/2015

Single-document summarization is a challenging task. In this paper, we explore effective ways of using the tweets linking to news to generate an extractive summary of each document. We reveal the basic value of tweets, which can be utilized by regarding every tweet as a vote for candidate sentences. Based on this finding, we resort to unsupervised summarization models that leverage the linking tweets to master the ranking of candidate extracts via random walk on a heterogeneous graph. The advantage is that we can use the linking tweets to opportunistically “supervise” the summarization with no need for reference summaries. Furthermore, we analyze the influence of the volume and latency of tweets on the quality of output summaries, since tweets come after the news release. Compared to a truly supervised summarizer unaware of tweets, our method achieves significantly better results with a reasonably small tradeoff in latency; compared to the same summarizer using tweets as auxiliary features, our method is comparable while needing fewer tweets and much less time to achieve significant outperformance.
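The idea of treating every tweet as a vote for a candidate sentence can be illustrated with a minimal sketch. It is not the authors' model: the word-overlap (Jaccard) matching rule, the crude tokenizer, and the function names `tokens`, `jaccard`, and `tweet_votes` are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercase word set; a deliberately crude tokenizer (assumption)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def tweet_votes(sentences, tweets):
    """Count votes per sentence: each tweet votes for its best-matching sentence.

    The matching rule (highest Jaccard overlap) is an illustrative assumption.
    Tweets that share no word with any sentence abstain.
    """
    if not sentences:
        return []
    sent_tokens = [tokens(s) for s in sentences]
    votes = Counter()
    for tweet in tweets:
        t = tokens(tweet)
        scores = [jaccard(t, s) for s in sent_tokens]
        best = max(range(len(sentences)), key=scores.__getitem__)
        if scores[best] > 0:
            votes[best] += 1
    return [votes[i] for i in range(len(sentences))]
```

Sentences with the most votes would then be the first candidates for the extract; the abstract's random walk on a heterogeneous graph goes further than this simple count.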

CiteSeerX

Institutional Knowledge at Singapore Management University

Utilizing microblogs for improving automatic news highlights extraction

Authors: GAO Wei, WEI Zhongyu
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/08/2014

Institutional Knowledge at Singapore Management University

Microblog search and filtering with time sensitive feedback and thresholding based on BM25

Authors: GAO Wei, WEI Zhongyu, WONG Kam-Fai
Publication venue: NIST Special Publication: SP 500-298
Publication date: 01/11/2012

Institutional Knowledge at Singapore Management University

DxFormer: A Decoupled Automatic Diagnostic System Based on Decoder-Encoder Transformer with Dense Symptom Representations

Authors: Chen Wei, Peng Jiajie, Wei Zhongyu, Zhong Cheng
Publication date: 23/12/2022

A diagnosis-oriented dialogue system queries the patient's health condition and makes predictions about possible diseases through continuous interaction with the patient. A few studies use reinforcement learning (RL) to learn the optimal policy from the joint action space of symptoms and diseases. However, existing RL (or non-RL) methods cannot achieve sufficiently good prediction accuracy, which remains far from its upper limit. To address the problem, we propose a decoupled automatic diagnostic framework, DxFormer, which divides the diagnosis process into two steps: symptom inquiry and disease diagnosis, where the transition from symptom inquiry to disease diagnosis is explicitly determined by the stopping criteria. In DxFormer, we treat each symptom as a token, and formalize symptom inquiry and disease diagnosis as a language generation model and a sequence classification model, respectively. We use the inverted version of the Transformer, i.e., the decoder-encoder structure, to learn the representation of symptoms by jointly optimizing the reinforce reward and cross entropy loss. Extensive experiments on three public real-world datasets prove that our proposed model can effectively learn doctors' clinical experience and achieve state-of-the-art results in terms of symptom recall and diagnostic accuracy.

Comment: 7 pages, 4 figures, 3 tables
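The two-step control flow described in the abstract (symptom inquiry, then disease diagnosis, separated by a stopping criterion) can be sketched roughly as follows. This is only an outline of the decoupling, not DxFormer itself: the callables `next_symptom`, `ask_patient`, and `classify`, and the particular stopping rule (a turn limit plus a confidence threshold), are hypothetical stand-ins, since the abstract does not specify them.

```python
def run_diagnosis(next_symptom, ask_patient, classify, known,
                  max_turns=10, min_confidence=0.5):
    """Run symptom inquiry until a stopping criterion fires, then diagnose.

    next_symptom(known) -> (symptom or None, confidence): hypothetical inquiry model.
    ask_patient(symptom) -> bool: hypothetical patient (or simulator) answer.
    classify(known)      -> disease label: hypothetical diagnosis model.
    The stopping rule below (turn limit or low confidence) is an assumption.
    """
    known = dict(known)  # copy so the caller's record is not mutated
    for _ in range(max_turns):
        symptom, confidence = next_symptom(known)
        if symptom is None or confidence < min_confidence:
            break  # stopping criterion: hand over to the diagnosis step
        known[symptom] = ask_patient(symptom)
    return classify(known), known
```

Keeping the two models behind separate callables mirrors the decoupling in the abstract: the inquiry policy can be changed without touching the classifier, and the stopping rule is an explicit, inspectable condition.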

arXiv.org e-Print Archive

Using tweets to help sentence compression for news highlights generation

Authors: GAO Wei, LI Chen, LIU Yang, WEI Zhongyu
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2015

We explore using relevant tweets of a given news article to help sentence compression for generating compressive news highlights. We extend an unsupervised dependency-tree based sentence compression approach by incorporating tweet information to weight the tree edges in terms of informativeness and syntactic importance. The experimental results on a public corpus that contains both news articles and relevant tweets show that our proposed tweet-guided sentence compression method can improve the summarization performance significantly compared to the baseline generic sentence compression method.

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University

Exploring tweets normalization and query time sensitivity for Twitter search

Authors: GAO Wei, LI Binyang, WEI Zhongyu, WONG Kam-Fai, ZHOU Lanjun
Publication venue: NIST Special Publication: SP 500-296
Publication date: 01/11/2011

Institutional Knowledge at Singapore Management University

Using content-level structures for summarizing microblog repost trees

Authors: GAO Wei, LI Jing, PENG Baolin, WEI Zhongyu, WONG Kam-Fai
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2015

A microblog repost tree provides strong clues on how an event described therein develops. To help social media users capture the main clues of events on microblogging sites, we propose a novel repost tree summarization framework by effectively differentiating two kinds of messages on repost trees called leaders and followers, which are derived from content-level structure information, i.e., contents of messages and the reposting relations. To this end, a Conditional Random Fields (CRF) model is used to detect leaders across repost tree paths. We then present a variant of a random-walk-based summarization model to rank and select salient messages based on the result of leader detection. To reduce the error propagation cascaded from leader detection, we improve the framework by enhancing the random walk with adjustment steps for sampling from leader probabilities given all the reposting messages. For evaluation, we construct two annotated corpora, one for leader detection, and the other for repost tree summarization. Experimental results confirm the effectiveness of our method.
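A random walk that is biased by leader probabilities can be sketched with a generic personalized-PageRank iteration. This is a swapped-in, standard technique meant only to illustrate the idea; it is not the authors' variant with adjustment steps, and the function name `biased_random_walk`, the damping value, and the example inputs are assumptions.

```python
def biased_random_walk(weights, leader_prob, damping=0.85, iters=100, tol=1e-10):
    """Rank messages by a random walk whose restarts favor likely leaders.

    weights[i][j]  : non-negative similarity from message i to message j.
    leader_prob[i] : probability that message i is a leader (e.g. from a tagger).
    Returns a score per message; scores sum to 1.
    """
    n = len(weights)
    if n == 0:
        return []
    total = sum(leader_prob)
    # Restart distribution: proportional to leader probability, uniform if all zero.
    restart = [p / total for p in leader_prob] if total > 0 else [1.0 / n] * n
    # Row-normalize the similarity matrix; rows with no outgoing weight use the restart.
    rows = []
    for row in weights:
        s = sum(row)
        rows.append([w / s for w in row] if s > 0 else restart[:])
    rank = [1.0 / n] * n
    for _ in range(iters):
        new = [
            (1 - damping) * restart[j]
            + damping * sum(rank[i] * rows[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new, rank))
        rank = new
        if delta < tol:
            break
    return rank
```

With equal similarities between all messages, the message with the highest leader probability ends up with the highest score, which is the intended effect of the bias.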

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University